62 research outputs found

    A novel access pattern-based multi-core memory architecture

    High-Performance Computing (HPC) applications increasingly run on heterogeneous multi-core platforms. The main reasons for the growing popularity of these architectures are their low power consumption and throughput-oriented nature. However, this throughput requires that data also be supplied to the multi-core system at a high rate. This makes efficient management of on-chip and off-chip memory data transfers necessary, which is a significant challenge. Complex regular and irregular memory data transfer patterns are becoming dominant in a range of application domains, including scientific computing and image and signal processing. Data accesses can be arranged in independent patterns that an efficient memory manager can exploit. Software-based approaches using general-purpose caches and on-chip memories are beneficial to some extent. However, data management for throughput-oriented devices could be improved by providing hardware mechanisms that exploit knowledge of access patterns in memory management and in the scheduling of accesses for a heterogeneous multi-core architecture. This thesis presents architectural explorations for a novel access pattern-based multi-core memory architecture. It covers four main aspects of the memory system: i) a uni-core memory system for regular data patterns; ii) a multi-core memory system for regular data patterns; iii) a uni-core memory system for irregular data patterns; and iv) a multi-core memory system for irregular data patterns.

    Muslim women who veil and Article 9 of the European Convention on Human Rights: A socio-legal critique

    Islamic veiling has been the subject of many theological, social and legal debates, which are fluid and whose intensity has been further influenced by the veil’s contextualised meanings, such as religiosity, modesty, identity, resistance, protest, choice and subjugation. Literature on Muslim veiling has examined its treatment from either a legal or a socio-feminist perspective, whereas this thesis critiques the religious, socio-feminist and legal discourses. The contemporary discourse is dominated by competing binaries that label the veil as a tool of oppression or one of empowerment. Many of the assertions are based not on the veil’s multiple meanings or the wearer’s true motivations but on misplaced assumptions of moral authority by those who oppose or defend the practice, as well as by native informants professing to represent veiled Muslim women, leaving veiled Muslim women’s voices muted. Having examined the religious imperative, which has a patriarchal basis, the thesis constructs a critique of the two dominant discourses central to the contemporary debates on veiling. One discourse defends the practice as empowering, whilst the other calls for prohibitions on the practice, using liberation from oppression as a justification, particularly with regard to the wearing of the full face veil. This is followed by a critique of the key cases generated under Article 9 ECHR, which attempts to balance the religious rights of those who veil with the rights of others. The case law highlights that the ECtHR not only falls short in disclosing satisfactorily how it has struck a balance between these competing rights, but also fails to adopt a neutral stance towards religious expression through symbols, its reasoning being based on contradictory stereotypes of Muslim women as passive victims of gender oppression in need of liberation. The influence of such stereotypes and an inadequate application of the margin of appreciation doctrine have led the ECtHR to validate state prohibitions on the hijab and the full face veil, thereby failing to acknowledge the voices of the veiled women at the centre of a human rights claim and delivering a further blow to them. Since the case of S.A.S. v. France, the ECtHR has exacerbated this by allowing an abstract principle of ‘living together’ as a justification for the full face veil’s prohibition in public spaces, so that the Article 9 rights of Muslim women who veil are endangered even further by the introduction of such an open-ended ground.

    EMVS: Embedded Multi Vector-core System

    With the increase in the density and performance of digital electronics, the demand for power-efficient high-performance computing (HPC) systems for embedded applications has increased. Existing embedded HPC systems suffer from programmability, scalability, and portability issues. Therefore, a parameterizable and programmable high-performance processor system architecture is required to execute embedded HPC applications. In this work, we propose an Embedded Multi Vector-core System (EMVS), which executes embedded applications by managing multiple vectorized tasks and their memory operations. The system is designed and ported to an Altera DE4 FPGA development board. The performance of EMVS is compared with the Heterogeneous Multi-Processing Odroid XU3, Parallela and GPU Jetson TK1 embedded systems. Compared with these embedded systems, the results show that EMVS improves application and system performance by 19.28 and 10.22 times respectively and consumes 10.6 times less energy. Peer Reviewed. Postprint (author's final draft).

    AMC: Advanced Multi-accelerator Controller

    The rapid advancement of FPGA technology, its use of diverse architectural features and the introduction of High Level Synthesis (HLS) tools have enhanced the capacity for data-level parallelism on a chip. A generic FPGA-based HLS multi-accelerator system requires a microprocessor (master core) that manages memory and schedules accelerators. In a real environment, such HLS multi-accelerator systems do not achieve ideal performance because of memory bandwidth issues. Such a system therefore needs a memory manager and a scheduler that improve performance by managing and scheduling the multi-accelerator’s memory access patterns efficiently. In this article, we propose the integration of an intelligent memory system and an efficient scheduler in the HLS-based multi-accelerator environment, called the Advanced Multi-accelerator Controller (AMC). The AMC system is evaluated with memory-intensive accelerators and High Performance Computing (HPC) applications, and is implemented and tested on a Xilinx Virtex-5 ML505 evaluation FPGA board. The performance of the system is compared against microprocessor-based systems that have been integrated with an operating system. Results show that the AMC-based HLS multi-accelerator system achieves speedups of 10.4x and 7x compared to the MicroBlaze-based and Intel Core-based HLS multi-accelerator systems. Peer Reviewed. Postprint (author’s final draft).

    A High-Performance System Architecture for Medical Imaging

    Medical imaging is classified into different modalities such as ultrasound, X-ray, computed tomography (CT), positron emission tomography (PET), magnetic resonance imaging (MRI), single-photon emission computed tomography (SPECT), nuclear medicine (NM), mammography, and fluoroscopy. Medical imaging includes various diagnostic and treatment techniques and methods for modelling the human body, and therefore plays an essential role in improving community health care. Medical imaging scans (such as X-ray and CT) are essential in a variety of health-care environments. With enhanced health-care management and the increasing availability of medical imaging equipment, the number of imaging-based systems worldwide is growing. Effective, safe, and high-quality imaging is essential for medical decision-making. In this chapter, we propose a high-performance hardware architecture and software programming toolkit for medical imaging called the high-performance medical imaging system (HPMIS). The HPMIS can perform medical image registration, storage, and processing in hardware with the support of C/C++ function calls. The system is easy to program and delivers high performance for different medical imaging applications.

    Impact of Time Taken on the Surgical Outcome of Extradural Hematoma in Patients with Road Traffic Accidents

    Background: To determine the impact of time taken on the surgical outcome of extradural hematoma in patients with road traffic accidents. Methods: Sixty adult patients with a history of road traffic accident and an extradural hematoma on axial CT images of the brain were included. The patients were allocated to three groups of 20 patients each. The time from the occurrence of trauma to the surgical evacuation of the hematoma was < 1 hour in group I, 1 to 6 hours in group II and > 6 hours in group III. Results: In group I, the majority (90 %) showed a favourable outcome. In group II, 70 % showed a favourable outcome. In group III, 50 % showed a favourable outcome. A significant association was found between outcome and time of surgery (p<0.05). Conclusions: The frequency of favourable outcome after surgical evacuation was significantly higher in patients in whom surgery was performed within one hour after the trauma (p<0.05).
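
    The reported association can be sanity-checked from the figures given in the abstract. The sketch below assumes a Pearson chi-square test of independence on a 3 x 2 table; the abstract does not say which test the authors used, and the counts (18, 14 and 10 favourable outcomes out of 20) are reconstructed from the reported percentages. The function names are illustrative only.

```python
import math

# Counts reconstructed from the abstract (assumption): 20 patients per group,
# with 90 %, 70 % and 50 % favourable outcomes respectively.
# Each entry is (favourable, unfavourable).
observed = {
    "Group I (< 1 h)": (18, 2),
    "Group II (1-6 h)": (14, 6),
    "Group III (> 6 h)": (10, 10),
}


def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a Pearson chi-square test."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for row, row_total in zip(rows, row_totals):
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            statistic += (obs - expected) ** 2 / expected

    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof


def p_value_two_dof(statistic):
    """Survival function of the chi-square distribution, valid only for 2 degrees of freedom."""
    return math.exp(-statistic / 2)


statistic, dof = chi_square_independence(observed)
p_value = p_value_two_dof(statistic)
print(f"chi-square = {statistic:.3f}, dof = {dof}, p = {p_value:.4f}")
```

    Under these assumptions the statistic is about 7.62 with 2 degrees of freedom, which gives a p-value of roughly 0.022 and is consistent with the reported p<0.05.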

    Memory controller for vector processor

    To manage power and memory wall effects, the HPC industry supports FPGA reconfigurable accelerators and vector processing cores for data-intensive scientific applications. FPGA-based vector accelerators are used to increase the performance of high-performance application kernels. Adding more vector lanes does not affect performance if the processor/memory performance gap dominates. In addition, if on/off-chip communication time becomes more critical than computation time, performance degrades. The system incurs multiple delays due to the application’s irregular data arrangement and complex scheduling scheme. Therefore, just like generic scalar processors, all classes of vector machines, from vector supercomputers to vector microprocessors, are required to have data management and access units that improve the on/off-chip bandwidth and hide main memory latency. In this work, we propose an Advanced Programmable Vector Memory Controller (PVMC), which boosts noncontiguous vector data accesses by integrating descriptors of memory patterns, a specialized on-chip memory, a memory manager in hardware, and multiple DRAM controllers. We implemented and validated the proposed system on an Altera DE4 FPGA board. The PVMC is also integrated with an ARM Cortex-A9 processor on the Xilinx Zynq All-Programmable System on Chip architecture. We compare its performance with that of systems with vector and scalar processors without PVMC. When compared with a baseline vector system, the results show that the PVMC system transfers data sets 1.40x to 2.12x faster, achieves speedups of 2.01x to 4.53x for 10 applications and consumes 2.56 to 4.04 times less energy. Peer Reviewed. Postprint (author's final draft).
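
    To illustrate what a descriptor of a memory pattern can look like, the following is a minimal software sketch of a strided access descriptor. It is not the PVMC descriptor format, which the abstract does not specify: the class `StridedDescriptor`, its fields and the `gather` helper are hypothetical names chosen for this example.

```python
import struct
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class StridedDescriptor:
    """Hypothetical descriptor for a noncontiguous (strided) access pattern."""
    base: int           # byte address of the first element
    stride: int         # distance in bytes between consecutive elements
    count: int          # number of elements to fetch
    elem_size: int = 4  # size of each element in bytes

    def addresses(self) -> Iterator[int]:
        """Yield the byte address of every element the pattern touches."""
        for i in range(self.count):
            yield self.base + i * self.stride


def gather(memory: bytes, desc: StridedDescriptor) -> bytes:
    """Collect the scattered elements into one contiguous buffer."""
    out = bytearray()
    for addr in desc.addresses():
        out += memory[addr:addr + desc.elem_size]
    return bytes(out)


# A 4 x 4 row-major matrix of 32-bit integers holding the values 0..15.
memory = struct.pack("<16i", *range(16))

# Column 1 starts at byte 4 and advances one full row (16 bytes) per element.
column_desc = StridedDescriptor(base=4, stride=16, count=4)
column = struct.unpack("<4i", gather(memory, column_desc))
print(column)
```

    A single descriptor of this kind stands in for four separate address computations, which is the sort of access-pattern knowledge a hardware memory manager could use to fetch noncontiguous data into contiguous on-chip storage.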

    ViPS: Visual processing system for medical imaging

    Imaging has become an indispensable tool in modern medicine. Various powerful and expensive platforms for studying medical imaging applications have appeared in recent years. In this article, we design and propose a Visual Processing System (ViPS) that processes medical imaging applications efficiently. ViPS provides a user-friendly programming environment and a high-performance architecture to perform image analysis, feature extraction and object recognition for complex real-time images or videos. The data structure of an image or video is described in the program memory using pattern descriptors; ViPS uses a specialized 3D memory structure to handle complex images or videos and processes them on microprocessors or application-specific hardware accelerators. The proposed system is highly reliable in terms of cost, performance, and power. A ViPS-based system is implemented and tested on a Xilinx Virtex-7 FPGA VC707 Evaluation Kit. The performance of ViPS is compared with graphics systems based on an Intel i7 multi-core processor and a GPU Jetson TK1 Embedded Development Kit with 192 CUDA cores. When compared with the Intel and GPU-based systems, the results show that ViPS performs real-time video reconstruction at 2x and 1.45x higher frame rates, achieves 14.6x to 4.8x speedup while executing different image processing applications, and achieves 20.3% and 12.6% speedup for video processing algorithms, respectively. Peer Reviewed.

    Audio-Visual Speech Enhancement and Separation by Leveraging Multi-Modal Self-Supervised Embeddings

    AV-HuBERT, a multi-modal self-supervised learning model, has been shown to be effective for categorical problems such as automatic speech recognition and lip-reading. This suggests that useful audio-visual speech representations can be obtained by using multi-modal self-supervised embeddings. Nevertheless, it is unclear whether such representations can be generalized to solve real-world multi-modal AV regression tasks, such as audio-visual speech enhancement (AVSE) and audio-visual speech separation (AVSS). In this study, we leveraged the pre-trained AV-HuBERT model followed by an SE module for AVSE and AVSS. Comparative experimental results demonstrate that our proposed model performs better than the state-of-the-art AVSE and traditional audio-only SE models. In summary, our results confirm the effectiveness of our proposed model for the AVSS task with proper fine-tuning strategies, demonstrating that multi-modal self-supervised embeddings obtained from AV-HuBERT can be generalized to audio-visual regression tasks. Comment: ICASSP AMHAT 202

    AMMC: advance multi-core memory controller

    In this work, we propose an efficient scheduler and intelligent memory manager known as AMMC (Advanced Multi-Core Memory Controller), which proficiently handles data movement and computational tasks. The proposed AMMC system improves performance by managing complex data transfers at run-time and scheduling multiple cores without the intervention of a control processor or an operating system. AMMC has been coupled with a heterogeneous system that provides both general-purpose cores and application-specific accelerators. The AMMC system is implemented and tested on a Xilinx ML505 evaluation FPGA board. The performance of the system is compared with a microprocessor-based system that has been integrated with the Xilkernel operating system. Results show that the AMMC-based multi-core system consumes 48% fewer hardware resources and 27.9% less on-chip power, and achieves a 6.8x speed-up compared to the MicroBlaze-based multi-core system. Peer Reviewed. Postprint (author’s final draft).